Analysis and Implementation of an Ajax-enabled Web Crawler
Abstract
This paper analyzes web crawling for Web 2.0, which presents new challenges for crawlers. It gives a comprehensive overview of existing programs and strategies, and covers the design and implementation of an Ajax crawler. Experimental verification indicates that the Ajax Crawler can effectively obtain Ajax dynamic pages. The proposed Ajax Crawler is then compared with common Ajax crawlers in terms of download speed.
Similar resources
GDist-RIA Crawler: A Greedy Distributed Crawler for Rich Internet Applications
Crawling web applications is important for indexing, accessibility, and security assessment. Crawling traditional web applications is an old problem, for which good and efficient solutions are known. Crawling Rich Internet Applications (RIA) quickly and efficiently, however, is an open problem. Technologies such as AJAX and partial Document Object Model (DOM) updates only make the problem of craw...
Prioritize the ordering of URL queue in Focused crawler
The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, downloading domain-specific web pages is not a simple task. An unfocused approach often produces undesired results. Therefore, several new ideas have been proposed; a key technique among them is focused crawling, which is able to crawl particular topical...
Enabling automatic testing of Modern Web Applications using Testing Plug-ins
Modern web applications are highly dynamic and offer a rich user experience. Such applications typically use Web 2.0 and Asynchronous JavaScript and XML (AJAX) technologies. They differ considerably from conventional web applications because they use stateful client/server communication in an asynchronous fashion. The user agent is able to communicate with web server without explicit form submiss...
Scheme for Client-Side Scripting in Mobile Web Browsing or AJAX-Like Behavior Without Javascript
I present an implementation of Scheme embedded within a Web browser for wireless terminals. The implementation is based on a port of TinyScheme integrated with RocketBrowser, an XHTML-MP browser running on Qualcomm BREW-enabled handsets. In addition to a comparison of the resulting script capabilities, I present the changes required to bring TinyScheme to Qualcomm BREW, including adding support for BREW components a...
AJAXSearch: crawling, indexing and searching web 2.0 applications
Current search engines such as Google and Yahoo! are prevalent for searching the Web. Search in dynamic pages, however, is either nonexistent or far from perfect. AJAX applications and Rich Internet Applications are examples of such dynamic pages. They are increasingly frequent on the Web (in YouTube, Amazon, GMail, Yahoo!Mail) or mobile devices and are offering a high degree of interactivity to the user, by seamlessly lo...